Thermal denaturation and folding rates of single domain proteins: size matters



Mai Suan Li$^1$, D. K. Klimov$^2$ and D. Thirumalai$^2$

$^1$Institute of Physics, Polish Academy of Sciences, Al. Lotnikow 32/46, 02-668 Warsaw, Poland
$^2$Institute for Physical Science and Technology, University of Maryland, College Park, MD 20742



Abstract 

We analyze the dependence of the thermal denaturation transition and folding
rates of globular proteins on the number of amino acid residues, $N$. Using
lattice Go models we show that $\Delta T/T_F \sim N^{-1}$, where $T_F$ is the folding
transition temperature and $\Delta T$ is the transition width computed from the
temperature dependence of the order parameter that distinguishes between
the unfolded state and the native basin of attraction. This finding is consistent
with the finite size effects expected for systems undergoing a phase transition
from a disordered to an ordered phase. The dependence of the folding rates
on $N$ for lattice models and for a dataset of 57 proteins and peptides shows
that $k_F \simeq k_F^0 \exp(-CN^{\beta})$ with $0 < \beta \le 2/3$ provides a good fit, where $C$ is
a $\beta$-dependent constant. We find that $k_F \simeq k_F^0 \exp(-1.1N^{1/2})$, with an average
(over the dataset of proteins) $k_F^0 \approx (0.4\,\mu\mathrm{s})^{-1}$, can estimate optimal protein
folding rates to within an order of magnitude in most cases. By using this fit
for a set of proteins with $\beta$-sheet topology we find that $k_F^0 \approx k_U^0$, the prefactor
for unfolding. The maximum ratio $k_U^0/k_F^0 \approx 10$ for this class of proteins.

I. INTRODUCTION 

Deciphering the factors that determine the foldability of protein sequences is
an important problem from the perspective of protein design, protein structure prediction,
and in vitro and in vivo protein folding. Foldability refers both to the folding rate, $k_F$, and
to the thermodynamics of the transition from the ensemble of unfolded states (U) to the native
state or, more precisely, to the native basin of attraction (NBA). Folding rates and the
associated equilibrium characteristics depend on intrinsic factors (sequence and topology) as
well as on external conditions (pH, temperature, salt concentration, and viscosity). Variation
in external conditions can alter not only the rates but also the mechanism of folding. Despite
this obvious fact, most studies have focused on the dependence of $k_F$ solely on
the characteristics of the native states as described by the crystal (or NMR) structures.

The role of finite size effects on the thermodynamics of protein folding has received
very little attention. The emphasis on the cooperativity of the transition from the U state to
the NBA seems to have precluded consideration of the role of $N$, the number of residues in a
sequence. This transition, for apparent two-state folders, has all the hallmarks of a (weak)
first-order phase transition. The highly cooperative U $\leftrightarrow$ NBA transition has led some
authors to suggest that there is no evidence that partially structured states contribute to
the thermodynamic properties of proteins. Computational studies have shown [?] that in a
$\beta$-hairpin forming sequence from the C-terminus of the GB1 protein, structure is acquired over
a finite range of temperatures, even though the overall folding can be described as a broad
"two-state" transition [?]. Experiments on refolding of barnase have also suggested that
structure is lost incrementally upon temperature-induced unfolding [?]. Direct measurement of
the temperature dependence of structure formation in a leucine zipper using one-dimensional
NMR experiments has established that the melting temperature varies across the structure [?].
Although the variations occur over a relatively narrow range of temperatures, it is clear from
these experiments that, because of the finite size of proteins, partially folded structures
contribute to folding thermodynamics. These observations warrant an examination of finite size
effects on the U $\leftrightarrow$ NBA transition. Building on our previous study [10], we further
investigate the role of $N$ in thermal denaturation using lattice models of proteins.

It has been noted [?] that $k_F$ correlates well with the relative contact order (RCO), which
measures the proximity of side chain contacts in the folded state. The notion that protein
folding is initiated by residues forming local structures and is thus determined by their
proximity along the sequence constitutes the basis of the hierarchical folding mechanism [12].
Thus, in retrospect, the correlation between the RCO and $k_F$ is not entirely unanticipated,
especially in $\alpha$-helical proteins. Although RCO is an important indicator of the folding rates,
it should be pointed out that there is little correlation between RCO and $k_F$ for proteins
with $\beta$-sheet topology. Clarke et al. [13] showed that neither $k_F$ nor the unfolding rates $k_U$
correlate with RCO for a class of $\beta$-sheet proteins belonging to the immunoglobulin (Ig) fold.
The RCO for the 6 proteins examined [?] lies in the very narrow range 0.17 < RCO < 0.20.
Nevertheless, the refolding rates for these proteins vary by a factor of 800. More recently,
Clarke and coworkers have shown that for a number of Ig domains from the muscle protein
titin $k_F$ can vary by over four orders of magnitude [14], although their RCO values are
expected to be nearly the same. These studies show that factors besides RCO play an
important role in the determination of $k_F$.

Surprisingly, it was initially suggested that neither the stability nor the size of
proteins plays a role in determining $k_F$. These counterintuitive observations contradict several
theoretical [?] and a few experimental studies [13]. More careful examination of
the database of well characterized proteins has shown that, although there are exceptions
[19], stability is an important factor that determines $k_F$ [?]. Recently, several studies
[?,22] have concluded that $N$, the number of amino acids, must also play an important role
in determining $k_F$. In this paper we examine the dependence of rates as well as thermal
denaturation of single domain proteins on $N$.

Beginning with the paper by one of us [15], a number of theoretical studies [16,17,18] have
predicted that $N$ should play a significant role in controlling $k_F$. Given that polypeptide
chains are heteropolymers we expect that their relaxation times in both the folded and
unfolded states must depend on $N$. Theoretical studies [15,16] suggest that the dependence of
$k_F$ on $N$ is dictated by the interplay of three characteristic temperatures of the polypeptide
chain, namely, $T_F$ (the folding transition temperature), $T_\theta$ (the collapse transition
temperature), and $T_g$ (the glass transition temperature). It appears that in most experiments
the external conditions are such that fastest folding is observed near the "tricritical" point,
where $T_F \approx T_\theta$, in accord with the prediction by Camacho and Thirumalai [23]. For near
optimal folding, as may be the case for minimally frustrated sequences, it has been argued
that

$$\ln(k_F/k_F^0) \simeq -\alpha \ln N, \qquad (1)$$

where $\alpha \approx 4$ and $k_F^0$ is an undetermined prefactor. For artificial Go models $\alpha \approx 3$
[?]. On the other hand, due to topological frustration, even the sequences following
two-state kinetics have a rough energy landscape. In this case

$$\ln(k_F/k_F^0) \sim -N^{\beta}. \qquad (2)$$

The value of $\beta$ has been suggested to be less than unity and is probably in the range
$0.5 < \beta < 1$ [?,16,?]. Given the limited range of $N$ for single domain proteins it is difficult
(see below) to determine $\beta$ precisely.

To probe finite size effects on thermally induced folding we have performed Monte Carlo
simulations using Go lattice models. These results are used to quantitatively establish the
effect of finite $N$ on rounding the U $\leftrightarrow$ NBA transition. A dataset of proteins for which
$k_F$ is available is used to draw lessons on the dependence of $k_F$ on $N$. Using these results
we show that unambiguous determination of $\beta$ is not possible. However, we argue that the
$N$ dependence given in Eq. (2) is useful in analyzing the experimental data. As a byproduct
of this work we also provide estimates of the folding and unfolding prefactors, $k_F^0$ and $k_U^0$.



II. MODELS AND METHODS 

For the numerical simulations we represent a polypeptide chain using a lattice Go model
without side chains. The energy of a conformation is

$$E = \sum_{i<j} \epsilon_{ij}\, \delta_{r_{ij},a}, \qquad (3)$$

where $a$ is the lattice spacing, $r_{ij}$ is the distance between non-bonded beads $i$ and $j$, and the
contact energies $\epsilon_{ij}$ are chosen to be $-1$ for native contacts and 0 for non-native ones. Go
models are useful in exploring general physical principles that govern protein folding under
the condition of marginal stability of the native state [?]. The sequences were selected by
a standard sequence space Monte Carlo algorithm, which maximizes the Z-score for a given
target structure. The target structures for each $N$ were chosen to be maximally compact.
For example, for $N = 18$ and $N = 80$ the native structures occupy the vertices of 3×3×2
and 4×4×5 cuboids, respectively.
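
As an illustration of the Go energy function, the following minimal sketch evaluates the contact energy of a cubic-lattice conformation. It assumes unit lattice spacing, a native contact energy of -1 and a non-native contact energy of 0; the function names and the 8-bead example structure are hypothetical and are not taken from the simulations described here.

```python
from itertools import combinations

def lattice_contacts(coords):
    """Return the set of non-bonded bead pairs (i, j), i < j, one lattice spacing apart."""
    pairs = set()
    for i, j in combinations(range(len(coords)), 2):
        if j - i < 2:                      # bonded neighbours do not count as contacts
            continue
        dist2 = sum((p - q) ** 2 for p, q in zip(coords[i], coords[j]))
        if dist2 == 1:                     # r_ij equals the lattice spacing (a = 1)
            pairs.add((i, j))
    return pairs

def go_energy(coords, native_contacts, eps_native=-1.0, eps_nonnative=0.0):
    """Sum contact energies: eps_native for native pairs, eps_nonnative otherwise."""
    return sum(eps_native if pair in native_contacts else eps_nonnative
               for pair in lattice_contacts(coords))

# Hypothetical 8-bead native structure filling a 2x2x2 cube (illustration only).
native = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0),
          (0, 1, 1), (1, 1, 1), (1, 0, 1), (0, 0, 1)]
native_contacts = lattice_contacts(native)
extended = [(k, 0, 0) for k in range(8)]   # fully extended chain, no contacts

print(go_energy(native, native_contacts))
print(go_energy(extended, native_contacts))
```

In a Go model only the contacts present in the target structure are attractive, so the native conformation is the unique ground state by construction.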

The thermodynamics of folding were determined using Monte Carlo simulations based on the
MS3 move set [27,28,29,30], which involves single, double and triple bead moves. Because
this move set involves multiparticle updates, it is much more efficient than the
standard move set [29,30,31]. The thermodynamic properties of the sequences are calculated
using the multiple histogram method. The typical number of Monte Carlo trajectories used
to collect histograms is 50-100, depending on $N$. The free energy is calculated as a function
of the number of native contacts $Q$, which is treated as an approximate reaction coordinate
for Go models. This allows us to estimate the dependence of folding and unfolding free
energy barriers on $N$.

For lattice models the structural similarity with the native conformation is measured by
the overlap function [23]

$$\chi = 1 - \frac{1}{N^2 - 3N + 2} \sum_{i \ne j, j \pm 1} \delta(r_{ij} - r_{ij}^0), \qquad (4)$$

where the superscript 0 refers to the native state. The folding temperature $T_F$ is defined as
the temperature at which $d\langle\chi\rangle/dT$ is maximum, and the transition width $\Delta T$ is defined
as the full width at half maximum of the peak in $d\langle\chi\rangle/dT$ at $T = T_F$.
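
The two definitions above can be turned into a short numerical recipe. The sketch below is only an illustration: it assumes the average overlap is available on an increasing temperature grid, uses central differences for the derivative and linear interpolation for the half-maximum crossings, and tests the routine on a synthetic two-state (sigmoidal) curve rather than on simulation data. The function name and the test curve are hypothetical.

```python
import math

def transition_width(T, chi):
    """Return (T_F, dT): peak position and full width at half maximum of d<chi>/dT."""
    Tm = T[1:-1]
    # central-difference estimate of d<chi>/dT at the interior grid points
    d = [(chi[k + 1] - chi[k - 1]) / (T[k + 1] - T[k - 1]) for k in range(1, len(T) - 1)]
    kmax = max(range(len(d)), key=d.__getitem__)
    half = d[kmax] / 2.0

    def crossing(step):
        """Temperature where the derivative falls to half maximum, walking in direction step."""
        k = kmax
        while 0 <= k + step < len(d) and d[k + step] > half:
            k += step
        k2 = k + step
        if not 0 <= k2 < len(d):
            raise ValueError("peak is not fully covered by the temperature grid")
        frac = (d[k] - half) / (d[k] - d[k2])   # linear interpolation between grid points
        return Tm[k] + frac * (Tm[k2] - Tm[k])

    return Tm[kmax], crossing(+1) - crossing(-1)

# Synthetic two-state curve (not simulation data): <chi> rises from 0 to 1 around T0.
T0, w = 1.0, 0.05
T = [0.5 + 0.005 * k for k in range(201)]
chi = [1.0 / (1.0 + math.exp(-(t - T0) / w)) for t in T]
T_F, dT = transition_width(T, chi)
print(T_F, dT, dT / T_F)
```

For this sigmoidal test curve the derivative has a sech-squared shape, so the expected width is about 3.53 w, which provides a simple check on the routine.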



III. RESULTS 

A. Finite size effects in thermal denaturation: The transition width $\Delta T$ is obtained
from the temperature dependence of $d\langle\chi\rangle/dT$ (see Fig. 1a for an example). For all
the sequences considered here $T_F \approx T_\theta$. For finite size systems the U $\leftrightarrow$ NBA transition
is expected to be rounded. The rounded nature of the transition, which has been seen in
simulations, is reflected in the temperature dependence of $d\langle\chi\rangle/dT$ (Fig. 1a). More
importantly, we expect $\Delta T/T_F$ to scale as

$$\frac{\Delta T}{T_F} \sim N^{-1}. \qquad (5)$$

The data for lattice Go models show that $\Delta T/T_F \sim N^{-\lambda}$ with $\lambda = 1.2 \pm 0.1$ (Fig. 1b).
The small deviation from the expected theoretical result (Eq. (5)) may be a consequence
of the relatively small $N \le 80$ in the sample. For small values of $N$ the native state does
not have a well-defined core. As a result fluctuations are relatively large, which may explain
the observed deviation. Analysis of the experimental data indeed shows that Eq. (5) is
obeyed with great precision [10,33].
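
An exponent such as $\lambda$ is typically extracted by a straight-line fit on a log-log plot. The following sketch shows one way to do this with ordinary least squares. The chain lengths are those used for the lattice sequences, but the width values are synthetic numbers generated from an exact power law purely to exercise the code; they are not the simulation results.

```python
import math

def fit_power_law(N, y):
    """Least-squares fit of ln y = ln c - lam * ln N; returns (lam, c)."""
    xs = [math.log(n) for n in N]
    ys = [math.log(v) for v in y]
    xm = sum(xs) / len(xs)
    ym = sum(ys) / len(ys)
    slope = (sum((x - xm) * (v - ym) for x, v in zip(xs, ys))
             / sum((x - xm) ** 2 for x in xs))
    return -slope, math.exp(ym - slope * xm)

chain_lengths = [18, 27, 36, 48, 64, 80]
# Synthetic widths obeying dT/T_F = 3 N^(-1.2) exactly (illustration only).
widths = [3.0 * n ** -1.2 for n in chain_lengths]
lam, c = fit_power_law(chain_lengths, widths)
print(lam, c)
```

With real data the scatter of the points about the line determines the uncertainty in the fitted exponent.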

B. $N$ dependence of folding and unfolding barrier heights at $T_F$ for Go models:
To compute the free energy barrier to folding, $\Delta F_F^{\ddagger}$ ($\approx \Delta F_U^{\ddagger}$, the unfolding barrier, at $T_F$), it
is necessary to define a reaction coordinate. The precise reaction coordinate for a
multidimensional process such as protein folding is difficult to ascertain. However, Onuchic and
coworkers [?] have argued that, for minimally frustrated systems such as the Go models, the
fraction of native contacts $Q$ may be appropriate. Accordingly, we have computed $F(Q)$ for
about 80 sequences with $N$ ranging from 18 to 80. This is the largest number of sequences
used so far to test the expected scaling of $\Delta F_F^{\ddagger}$ and $\Delta F_U^{\ddagger}$. At $T_F$,
$\tau_F^0 \exp(\Delta F_F^{\ddagger}/k_BT_F) = \tau_U^0 \exp(\Delta F_U^{\ddagger}/k_BT_F)$. Because it is not obvious that
$\tau_F^0 \approx \tau_U^0$, $\Delta F_F^{\ddagger}$ and $\Delta F_U^{\ddagger}$ may, in principle, exhibit different scaling behavior with $N$.

From the typical free energy profile $F(Q)$ (Fig. 2a) we computed $\Delta F_F^{\ddagger}$ and $\Delta F_U^{\ddagger}$. The
variation of $\Delta F_F^{\ddagger}/k_BT_F$ as a function of $\ln N$, $N^{1/2}$ and $N^{2/3}$ for Go sequences, plotted in
Fig. 2b,c,d, respectively, shows that all three fits quantitatively reproduce the simulation
results. However, we argue below using the analysis of experimental data that $\Delta F_F^{\ddagger} \sim \ln N$
is not viable. Based on experimental estimates of $\tau_F^0$ and $\tau_U^0$ we find that $\Delta F_F^{\ddagger} \sim N^{1/2}$
provides the best physically acceptable representation of the data. From the lattice model
computations we find that $\Delta F_F^{\ddagger}$ and $\Delta F_U^{\ddagger}$ have the same dependence on $N$, which implies that
$\tau_F^0 \approx \tau_U^0$.
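
To make the barrier extraction concrete, here is a minimal sketch of how folding and unfolding barriers can be read off a free energy profile along $Q$. It assumes a single histogram of $Q$ collected at one temperature (a simplification of the multiple histogram method used in the simulations), a double-well profile with the unfolded basin at low $Q$, and made-up counts; all names and numbers are hypothetical.

```python
import math

def free_energy_profile(counts, kT=1.0):
    """F(Q) = -kT ln P(Q) from a histogram of Q; every bin must be populated."""
    if any(c <= 0 for c in counts):
        raise ValueError("every Q bin needs at least one sample")
    total = float(sum(counts))
    return [-kT * math.log(c / total) for c in counts]

def barrier_heights(F):
    """Return (dF_fold, dF_unfold, q_ts) for a double-well profile F(Q).

    The transition state is the point with the largest barrier above the
    higher of the two basins on either side of it.
    """
    best = None
    for q in range(1, len(F) - 1):
        left_min = min(F[:q])          # unfolded basin (low Q side)
        right_min = min(F[q + 1:])     # native basin (high Q side)
        score = F[q] - max(left_min, right_min)
        if best is None or score > best[0]:
            best = (score, q, left_min, right_min)
    _, q_ts, left_min, right_min = best
    return F[q_ts] - left_min, F[q_ts] - right_min, q_ts

# Made-up histogram of Q with two basins (illustration only).
counts = [5, 200, 20, 2, 10, 50, 5]
F = free_energy_profile(counts)
dF_fold, dF_unfold, q_ts = barrier_heights(F)
print(dF_fold, dF_unfold, q_ts)
```

At the folding temperature the two basins are equally populated, so the two barrier heights obtained this way are expected to be close to each other.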






C. Chain length dependence of folding rates: The RCO, which is a characteristic of
the native topology of proteins, is

$$\mathrm{RCO} = \frac{1}{N}\, \frac{\sum_{i<j} \Delta_{ij}\, |i - j|}{\sum_{i<j} \Delta_{ij}}, \qquad (6)$$

where $|i - j|$ is the sequence separation between residues $i$ and $j$ and $\Delta_{ij}$ is unity if
$i$ and $j$ form a native contact and zero otherwise. The observed correlation between RCO
and $\ln k_F$ suggests that folding is most rapid if the native state has a large fraction of
local contacts. The importance of RCO is based on the sound physical idea that residues
local in sequence space tend to form interactions early in the folding process and, if these
substructures can "coherently" add to produce the folded structure, efficient folding may
be realized. However, almost all proteins are stabilized by a sizable fraction of long-ranged
(non-local) contacts. This suggests that $\ln k_F$ may also depend on other factors (for example,
stability [20] and $N$) besides RCO. Lattice model simulations and experiments [13] have
shown that native state stability is also a contributing factor to refolding rates.
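
For concreteness, the sketch below computes the RCO from a list of native contacts, assuming the usual definition: the mean sequence separation of the native contacts divided by the chain length $N$. The function name and the 8-residue contact list are hypothetical and serve only as an illustration.

```python
def relative_contact_order(native_contacts, n_residues):
    """Mean sequence separation |i - j| over native contacts, divided by chain length."""
    if not native_contacts:
        raise ValueError("at least one native contact is required")
    total_separation = sum(abs(i - j) for i, j in native_contacts)
    return total_separation / (n_residues * len(native_contacts))

# Hypothetical contact map of an 8-residue lattice structure (illustration only).
example_contacts = [(0, 3), (0, 7), (1, 6), (2, 5), (4, 7)]
rco = relative_contact_order(example_contacts, 8)
print(rco)
```

Structures dominated by local contacts (small $|i - j|$) give a low RCO, whereas a single long-range contact such as (0, 7) raises it noticeably in a short chain.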

Depending on the extent of energy frustration, one of us suggested [15] that $k_F \sim N^{-\alpha}$
(for optimized sequences) or $k_F \sim \exp(-C_1 N^{1/2})$, where $C_1$ is a constant (Eqs. (1,2)). By
balancing the "bulk" free energy gain due to the formation of a stable hydrophobic core
against the surface tension cost due to interface formation, it has been proposed that for
optimal folding $k_F \sim \exp(-C_2 N^{2/3})$, where $C_2$ is a constant. Although the limited range of
$N$ values accessible in proteins makes it difficult to determine unambiguously the precise way
$k_F$ decreases with increasing $N$, it is generally agreed that free energy barriers in proteins
are relatively small. Moreover, the transition region could be broad with roughness
superimposed on it. As a result $\Delta F_F^{\ddagger}/k_BT_F$ is expected to grow only as $N^{\beta}$ with $\beta < 1$. The
sublinear growth of $\Delta F_F^{\ddagger}/k_BT$ with $N$ naturally explains both the rapid folding
(kinetics) and the marginal stability (thermodynamics) of folded states of proteins.

Recently, Koga and Takada have computed folding rates for 18 proteins using $C_\alpha$-Go
models. They fit the data using $k_F \sim \exp(-C_3\,\mathrm{RCO} \times N^{\beta})$ with $\beta = 0.607 \pm 0.179$, where
$C_3$ is a constant. Within the error bar of their fit it is impossible to distinguish between
$\beta = 0.5$ and 2/3. Their results showed, as argued on theoretical grounds, that $\beta < 1$. In addition,
due to the possibility that RCO decreases with $N$ [22], it is likely that the actual value of
$\beta$ in that study is considerably smaller. By focusing on the proteins that fold by three-state
kinetics Galzitskaya et al. have argued that chain length $N$ is the major determinant of folding
rates. However, they were unable to determine the precise dependence of $k_F$ on $N$.

Ivankov et al. have reconsidered the chain length dependence of $k_F$ by analyzing
experimental data for 57 proteins (both two- and three-state folders) and peptides. They suggested
that $\ln k_F \approx -0.44\,\mathrm{RCO} \times N + 11.15$ for the set of 57 proteins, with the correlation
coefficient $\rho = 0.74$. For this dataset of proteins it is argued that $\mathrm{RCO} \sim N^{-0.3}$, so that
$k_F \sim \exp(-C_4 \times N^{0.7})$, where $C_4$ is a constant. Because there are errors in fitting RCO
to a power law decay with $N$, the indirect inference that $\beta \approx 0.7$ is not transparent. To
circumvent this problem we have directly examined the dependence of $\ln k_F$ on $N$. The fits
of $\ln k_F$ using the theoretically proposed models are shown in Fig. 3a, b and c. The
correlation coefficient for the fits $\ln k_F \sim N^{\beta}$ is nearly constant for $0 < \beta \le 2/3$ and begins to
decrease modestly for $\beta > 2/3$ (Fig. 3d). We have also established that the folding rates in
lattice Go models can be adequately fit with $\beta = 0$ (the logarithmic form), 0.5, or 2/3 [?]. From
this perspective alone it is difficult to distinguish between the three theoretical values of $\beta$
(0, 0.5, and 2/3). However, we rule out $\ln k_F \sim \ln N$ (Fig. 3c) based on the following arguments:
(1) The power law fit yields $\ln k_F = -5.5 \ln N + 28.5$, which implies $k_F^0 \approx e^{28.5}\,\mathrm{s}^{-1} = (0.4\,\mathrm{ps})^{-1}$.
This value for the prefactor $k_F^0$ is nearly the same as $k_BT/h \approx (0.2\,\mathrm{ps})^{-1}$, which is reasonable
for small molecules but is not appropriate to describe folding reactions. (2) The value of the
exponent $\alpha = 5.5$ is too large to be justified theoretically. Such a large value of $\alpha$ is usually
indicative of an underlying activated process with a relatively small barrier [?].

The fits to the data in Fig. 3 cannot distinguish between the scalings of $\ln k_F$ with $N^{1/2}$
and $N^{2/3}$. This is consistent with our results presented in Fig. 2. In an attempt to further
discriminate between $\beta = 0.5$ and $\beta = 2/3$ we focus on the numerical values of the prefactor $k_F^0$.
The inverse of the prefactor, $1/k_F^0$, for the $N^{1/2}$ scaling of the barrier height is 0.4 $\mu$s, whereas
$1/k_F^0 \approx 8\,\mu$s for the $N^{2/3}$ scaling (see caption to Fig. 3). By applying Kramers' theory to describe
the U $\leftrightarrow$ NBA transition it has been argued that $\tau_0 = 1/k_F^0$ should be considerably greater
than $h/k_BT$ [?]. The range $0.4\,\mu\mathrm{s} < \tau_0 < 8\,\mu\mathrm{s}$ obtained from the two fits is consistent
with this expectation. Therefore, unless a direct experimental measurement
of $k_F^0$ is made, it would be difficult to determine the precise value of $\beta$. The goodness of fits
with $\beta = 1/2$ or $\beta = 2/3$ shows clearly that barriers to folding scale sublinearly with $N$.
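
The prefactors quoted above follow directly from the intercepts of the linear fits reported in the caption to Fig. 3, since $\ln k_F^0$ is the intercept when $k_F$ is measured in $\mathrm{s}^{-1}$. The short sketch below reproduces this arithmetic and shows how a rate estimate for a given chain length would be obtained; the function names are hypothetical, and the slopes and intercepts are the fitted values quoted in the text.

```python
import math

# (slope, intercept, exponent beta) of ln k_F = slope * N**beta + intercept, k_F in 1/s
FITS = {
    "N^(1/2)": (-1.1, 14.7, 0.5),
    "N^(2/3)": (-0.36, 11.7, 2.0 / 3.0),
}

def prefactor_time_us(intercept):
    """Inverse prefactor 1/k_F^0 in microseconds from the fit intercept ln k_F^0."""
    return math.exp(-intercept) * 1e6

def folding_rate(n_residues, slope, intercept, beta):
    """Estimated folding rate (1/s) for a chain of n_residues."""
    return math.exp(slope * n_residues ** beta + intercept)

for name, (slope, intercept, beta) in FITS.items():
    print(name, prefactor_time_us(intercept), folding_rate(80, slope, intercept, beta))

# The logarithmic fit ln k_F = -5.5 ln N + 28.5 gives a prefactor in picoseconds.
log_fit_prefactor_ps = math.exp(-28.5) * 1e12
print(log_fit_prefactor_ps)
```

The same intercepts give roughly 0.4 $\mu$s and 8 $\mu$s for the two power-law forms and about 0.4 ps for the logarithmic form, which is the basis for rejecting the latter.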

D. Prefactors for folding and unfolding: There is considerable interest in obtaining
a fairly accurate estimate of $\tau_F^0$ ($\approx (k_F^0)^{-1}$) near neutral pH and $T = 25\,^{\circ}$C so that
measurements of average barrier heights can be made directly. Estimates of $\tau_F^0$ have been
made using a few physically motivated arguments:

(1) Assuming that the most elementary step in the folding process is the formation of a
single tertiary contact (a loop between two residues separated by $l$ intervening residues), it
was argued that the speed limit for folding is about 1 $\mu$s [?]. Because the most probable loops
are predicted to form in about [...], it follows that $\tau_F^0 \approx 1\,\mu$s. Eaton and coworkers
[?] have provided additional arguments that proteins are unlikely to fold faster than
$\tau_F^0 \approx$ [...].

(2) Yang and Gruebele [?] argue, using refolding data of mutants of the helical protein
$\lambda_{6-85}$, that $\tau_F^0 \approx 2\,\mu$s. We believe that, in general, for the majority of proteins
$\tau_F^0 \approx 2\,\mu$s should be near the upper limit, for the following reasons. Based on theories of
collapse dynamics we expect that the 80 residue protein $\lambda_{6-85}$ becomes compact in about
$\tau_c \approx (\eta a/\gamma) N^{\theta} \approx 1.5\,\mu$s, where $\eta$ is the solvent viscosity, $a$ is the Flory characteristic
ratio, $\gamma$ is the surface tension (50 cal/mol Å$^2$), $\theta \approx 2.2$ and $N = 80$. This estimate is close to
the folding time for $\lambda_{6-85}$, which suggests that collapse and folding are nearly simultaneous
for this protein. Because these two processes cannot be separated for proteins with $N = 80$ that
fold in about 1 $\mu$s, it appears that $\tau_F^0 \approx 2\,\mu$s may be an upper bound. We believe that
$\tau_F^0 \approx 1\,\mu$s could serve as a practical estimate for the prefactor, because on time scales greater
than 1 $\mu$s multiple loops can form and collapse of the entire polypeptide chain can occur,
which could obfuscate direct determination of $\tau_F^0$. In arriving at these estimates for
$\lambda_{6-85}$ we have assumed that internal viscosity does not alter folding rates appreciably.
Although a similar observation has been made for refolding of protein L [?] and CspB, it is
unclear how important the internal viscosity of proteins is in the determination of $\tau_F^0$ [?].
In addition, external conditions can alter $\tau_F^0$. Thus, $\tau_F^0 \approx 1\,\mu$s should be taken merely
as a useful estimate for the prefactor.

The dependence of $\ln k_F$ on $N$ (Fig. 3) allows us to estimate $\tau_F^0$ and $\tau_U^0$ using the
experimental data for proteins that are not fully represented in Fig. 3. We use refolding
rates for several $\beta$-sheet proteins to estimate $\tau_F^0$ (Table 1). Assuming $N^{1/2}$ scaling we find
that $\tau_F^0$ is in the range 0.1 to 18 $\mu$s. Except for $\tau_F^0 \approx 18\,\mu$s obtained for twitchin (TWIg18')
with low native state stability, the average value of the prefactor is $\tau_F^0 \approx 3.5\,\mu$s. If we
use $\tau_F^0 \approx \tau_F \exp(-0.36N^{2/3})$ (Fig. 3), we find $2\,\mu\mathrm{s} < \tau_F^0 < 400\,\mu\mathrm{s}$ (Table 1). For four
immunoglobulin proteins, with the exception of FNfn10 (Table 1), the estimated values of $\tau_F^0$
using the $N^{2/3}$ scaling for the barrier height seem too large. Thus, $\tau_F^0$ appears to be in the
neighborhood of a few $\mu$s for the $\beta$-sheet proteins and for the $\alpha$-helical protein $\lambda_{6-85}$.

Another question of interest is whether $\tau_F^0 \approx \tau_U^0$. Using lattice model simulations we have
previously argued that the unfolding and folding prefactors are similar [?]. This conclusion
was reached using the number of native contacts $Q$ as a reaction coordinate. It is unclear
whether this result is a consequence of our choice of the reaction coordinate. The results in
Fig. 3 and the measured unfolding rates in Table 1 allow us to directly estimate

$$\tau_U^0 \approx \tau_U \exp[-(1.1N^{1/2} + \beta\Delta G)], \qquad (7)$$

where $\tau_U$ is the unfolding time, $\Delta G$ is the free energy of stability of the native state, and
$\beta = (k_BT)^{-1}$ (here $\beta$ denotes the inverse temperature, not the scaling exponent). With the
exception of TWIg18' the ratio $\tau_U^0/\tau_F^0 < 1$ and is in the range
$0.1 \lesssim \tau_U^0/\tau_F^0 \lesssim 1.0$. For this class of proteins the maximum value of $\tau_F^0/\tau_U^0 \lesssim 10$ (Table 1).
Similar conclusions have been drawn for $\alpha$-helical proteins as well. Thus, it appears that
the folding and unfolding prefactors are comparable, differing by at most about an order of
magnitude.


IV. CONCLUSIONS 

In this article we have considered finite size effects in thermal denaturation and folding
kinetics. We have established using lattice models that the rounded transition, as quantified
by $\Delta T/T_F$, obeys the expected scaling (Eq. (5)). This is in accord with the earlier analysis
of the experimental data [10], which further suggests that qualitative features of the folding
transition can be gleaned using lattice models. Unlike the case of thermal denaturation, the
situation is far more ambiguous when the scaling of $k_F$ with $N$ is considered. The dependence
of $\ln k_F$ on $N$ does not match the quality of correlation noted for thermodynamics. If we
delete the fastest folding proteins and peptides and the slowest folding proteins from the
dataset in Fig. 3, the correlation coefficient becomes considerably worse ($\approx 0.56$) regardless
of the scaling ($\beta = 0.5$ or 2/3) used. Nevertheless, the inclusion of the $N$ dependence does
improve the correlation between $\ln k_F$ and RCO. Using the expected values (from a
number of unrelated studies) for the prefactor, we suggest that the $N^{1/2}$ scaling for the barrier
height $\Delta F_F^{\ddagger}/k_BT$ may be useful in making order of magnitude estimates of refolding rates.
This scaling also implies that the energy landscape of two-state proteins is rugged. The
energy scale for roughness may be of the order of a few $k_BT$.

Table 1. Estimates of the folding and unfolding prefactors (a)

protein (N) (b)              βΔG (c)  τ_F (d)   τ_U (e)  τ_F^0 (f)  τ_F^0 (g)  τ_U^0 (h)  τ_U^0 (i)
TI I27 (89)                  12.7     0.0313    2041     0.974      23.9       0.194      4.76
TWIg18' (93)                 6.9      0.667     3571     16.5       412        89.0       2220
CD2d1 (98)                   11.5     0.0556    588      1.04       26.4       0.111      2.83
TNfn3 (92)                   9.1      0.344     2174     9.00       224        6.35       158
FNfn10 (96)                  15.9     0.00417   4348     0.0870     2.20       0.0113     0.285
CspB (B. subtilis) (67)      4.6      0.00145   0.101    0.178      3.82       0.125      2.68
CspB (B. caldolyticus) (66)  8.1      0.000730  1.56     0.0960     2.04       0.0623     1.32
CspB (T. maritima) (68)      10.6     0.00177   55.6     0.203      4.40       0.159      3.44


(a) Data for the first five proteins are from [?] and the data for CspB proteins are from [?]

(b) Numbers in parentheses are the values of $N$

(c) Free energy of stability extrapolated to zero denaturant concentration

(d) Folding times in seconds

(e) Unfolding times in seconds

(f) Folding prefactor (in units of $\mu$s) calculated using $\tau_F^0 = \tau_F \exp(-1.1N^{1/2})$

(g) Folding prefactor (in units of $\mu$s) calculated using $\tau_F^0 = \tau_F \exp(-0.36N^{2/3})$

(h) Unfolding prefactor (in $\mu$s) calculated using $\tau_U^0 = \tau_U \exp(-1.1N^{1/2} - \Delta G/(k_BT))$

(i) Unfolding prefactor (in $\mu$s) calculated using $\tau_U^0 = \tau_U \exp(-0.36N^{2/3} - \Delta G/(k_BT))$
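
The prefactor columns of Table 1 can be regenerated from the measured times using the expressions in footnotes (f) to (i). The sketch below does this for the first row (TI I27); the function names are hypothetical, and small differences from the tabulated values are expected because the inputs are rounded.

```python
import math

def folding_prefactor_us(tau_f, n, c, beta):
    """tau_F^0 in microseconds from the folding time tau_f (s): tau_f * exp(-c * N**beta)."""
    return tau_f * math.exp(-c * n ** beta) * 1e6

def unfolding_prefactor_us(tau_u, n, c, beta, stability):
    """tau_U^0 in microseconds; stability is the dimensionless free energy dG/(k_B T)."""
    return tau_u * math.exp(-(c * n ** beta + stability)) * 1e6

# TI I27 row of Table 1: N = 89, dG/(k_B T) = 12.7, tau_F = 0.0313 s, tau_U = 2041 s
n, stability, tau_f, tau_u = 89, 12.7, 0.0313, 2041.0
row = (
    folding_prefactor_us(tau_f, n, 1.1, 0.5),                      # footnote (f)
    folding_prefactor_us(tau_f, n, 0.36, 2.0 / 3.0),               # footnote (g)
    unfolding_prefactor_us(tau_u, n, 1.1, 0.5, stability),         # footnote (h)
    unfolding_prefactor_us(tau_u, n, 0.36, 2.0 / 3.0, stability),  # footnote (i)
)
print(row)
```

Applying the same two functions to the remaining rows provides a quick consistency check on the table.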

Figure captions

Fig. (1) (a) Temperature dependence of $d\langle\chi\rangle/dT$ for the lattice sequence with
$N = 64$. The folding transition temperature is identified with the peak in $d\langle\chi\rangle/dT$.
The full width at half maximum is indicated by $\Delta T$. (b) The dependence of $\Delta T/T_F$ on
$N$. The straight line gives the fit $\Delta T/T_F \sim N^{-\lambda}$ with $\lambda = 1.2 \pm 0.1$.

Fig. (2) (a) Dependence of $F/k_BT$ ($F$ is the free energy of a sequence) on
the presumed reaction coordinate $Q$, the number of native contacts, for one of the $N = 64$
Go sequences. The unfolding and refolding barriers are extracted from the free energy profile
as indicated. Panel (b) shows the fit $\Delta F_F^{\ddagger}/k_BT_F \sim \ln N$. Panels (c) and (d) correspond to
the fits $\Delta F_F^{\ddagger}/k_BT_F \sim N^{\beta}$ with $\beta = 0.5$ and 2/3, respectively. The results were computed
for $N$ = 18(20), 27(17), 36(18), 48(18), 64(15), and 80(12), where the number in parentheses
refers to the number of sequences used for averaging $\Delta F_F^{\ddagger}/k_BT_F$. Similar scaling with $N$ is
obtained for $\Delta F_U^{\ddagger}/k_BT_F$.

Fig. (3) Fits of $\ln k_F$ as a function of $N$ for the dataset of 57 proteins and peptides.
Cross and hexagon symbols correspond to three- and two-state folders,
respectively. (a) The fit based on $\ln k_F \sim N^{1/2}$. The straight line is $y = -1.1x + 14.7$ and
the correlation coefficient is 0.71. (b) The fit based on $\ln k_F \sim N^{2/3}$. The straight line is
$y = -0.36x + 11.7$ and the correlation coefficient is 0.70. (c) The fit $\ln k_F \sim \ln N$ gives
$y = -5.5x + 28.5$ with a correlation coefficient of 0.72. (d) Variation of the correlation
coefficient with $\beta$. The correlation becomes weaker at $\beta > 2/3$.